AIbase

# Inference optimization

Llama 3.1 Nemotron Nano 4B V1.1 GGUF
Other
Llama-3.1-Nemotron-Nano-4B-v1.1 is a large language model derived from Llama 3.1 and optimized to balance accuracy and efficiency. It is suited to use cases such as AI agents and chatbots.
Large Language Model, Transformers, English
Mungert
2,177
1
QwQ 32B FP8 Dynamic
MIT
An FP8-quantized version of QwQ-32B. Dynamic quantization reduces storage and memory requirements by 50% while retaining 99.75% of the original model's accuracy.
Large Language Model, Transformers
nm-testing
3,895
3
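The 50% storage figure quoted for the FP8 model follows from simple arithmetic if the original weights are stored in a 16-bit format. The sketch below works through that calculation. It rests on two assumptions that the listing does not state: a parameter count of roughly 32 billion (read off the model name) and a 16-bit original precision. It counts weights only and ignores quantization scales, activations, and the KV cache, so real numbers will differ somewhat.

```python
def weight_bytes(num_params: int, bits_per_param: int) -> int:
    """Approximate storage for model weights only, in bytes."""
    return num_params * bits_per_param // 8


# Assumption: about 32 billion parameters, inferred from the name "32B".
NUM_PARAMS = 32_000_000_000

# Assumption: the original checkpoint stores weights in 16 bits (e.g. BF16).
bf16_bytes = weight_bytes(NUM_PARAMS, 16)
fp8_bytes = weight_bytes(NUM_PARAMS, 8)

# Fraction of weight storage saved by moving from 16-bit to 8-bit.
reduction = 1 - fp8_bytes / bf16_bytes

print(
    f"16-bit: {bf16_bytes / 1e9:.0f} GB, "
    f"FP8: {fp8_bytes / 1e9:.0f} GB, "
    f"reduction: {reduction:.0%}"
)
```

Under these assumptions the weights shrink from about 64 GB to about 32 GB, which matches the 50% reduction stated in the description.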
© 2025 AIbase